Report Title

Data Representation: Learning Kernels from Noisy Data and Uncertain Information

ABSTRACT 

Identifying an appropriate data representation is critical to many
decision-making problems. In this project, we focus on learning
kernel-based data representations from noisy data and uncertain
information. Unlike conventional approaches that represent objects
by vectors, a kernel representation defines a pairwise similarity
between two objects and is convenient for representing complex
objects such as graphs. Although many studies are devoted to learning
kernel representations, none of them addresses the challenge of
learning a kernel representation from noisy data and uncertain
information. The proposed research aims to address this challenging
problem by developing (i) a kernel learning framework that is
robust to data noise and information uncertainty, and (ii) efficient
algorithms to solve the related optimization problems. The proposed
algorithms will be evaluated in the object recognition domain. The
impact of the proposed research on the US Army is significant. To
counter future threats to the safety and security of our
society, we need to enhance our capabilities to detect, locate, and
track such threats by extracting and representing data from noisy
observations and uncertain information. The proposed research seeks
to significantly advance, both theoretically and computationally,
the representation and modeling of information from noisy and
uncertain sources.
Data Representation: Learning Kernels from Noisy Data and Uncertain Information

Proposal Number: 56976-NS
Rong Jin and Anil Jain, Michigan State University


Statement of Problem: Identifying an appropriate data representation is critical to many problems in pattern recognition, data mining,
and machine learning. Compared to vector-based representations, a kernel-based data representation is more flexible and is
particularly suitable for complex objects, such as trees and graphs, that are difficult to capture with vectors. In this
project, we focus on the problem of automatically learning kernel-based data representations from noisy data and uncertain
information. This is in contrast to most current studies on kernel learning, which assume an ideal, noise-free observation or sensing of
objects. The objective of this project is to develop efficient computational frameworks for learning a robust combination of
multiple kernel data representations from noisy data observations and uncertain supervised information. The proposed research aims
to develop the following approaches to address the key challenges in kernel learning with noisy data and uncertain information:

1. Develop an efficient computational framework for multiple kernel learning that is resilient to noise in data observations.

2. Develop an efficient computational framework for multiple kernel learning that is robust to uncertainty in class assignments.
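
To make the notion of a combination of multiple kernels concrete, the following is a minimal sketch and not the framework developed in this project. It assumes two standard base kernels (linear and Gaussian RBF) on small vectors and fixed, hand-picked combination weights; the function names, the `gamma` value, and the weights are all illustrative assumptions. Multiple kernel learning would learn the weights from data rather than fix them.

```python
import math

def linear_kernel(x, z):
    # Inner product of two equal-length vectors.
    return sum(a * b for a, b in zip(x, z))

def rbf_kernel(x, z, gamma=0.5):
    # Gaussian RBF similarity; gamma is an illustrative bandwidth choice.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def combined_kernel(x, z, kernels, weights):
    # Convex combination sum_i w_i * k_i(x, z) of the base kernels.
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * k(x, z) for w, k in zip(weights, kernels))

def gram_matrix(points, kernel):
    # Pairwise similarity matrix: the kernel-based representation of the data.
    return [[kernel(x, z) for z in points] for x in points]

points = [(0.0, 1.0), (1.0, 1.0), (2.0, 0.0)]
kernels = [linear_kernel, rbf_kernel]
weights = [0.3, 0.7]  # fixed here; a kernel learning method would fit these

K = gram_matrix(points, lambda x, z: combined_kernel(x, z, kernels, weights))
for row in K:
    print([round(v, 4) for v in row])
```

Because a non-negative weighted sum of valid kernels is again a valid kernel, the combined matrix `K` can be passed to any kernel method in place of a single base kernel.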

Significance: This project addresses a fundamental problem in pattern recognition and machine learning, i.e., how to derive an
accurate data representation from noisy observations and uncertain side information. The results of this research will lead to significant
progress in kernel learning, a critical component of many pattern recognition and machine learning algorithms and theories. Given the
growing threats of global terrorism and illegal activities such as transportation of hazardous materials and human trafficking, the
results of this research will significantly advance, both theoretically and computationally, the representation and modeling of
information from noisy and uncertain sources, which in turn improves our capabilities to detect, locate, and track various threats.
The results of this research will benefit the Army by expanding the wealth of information that can be utilized in a network-centric
environment to support effective and reliable decision making during combat missions and in the global war on terrorism. The
theoretical and computational advances proposed in this project are also key steps toward enabling the Army to gain
information superiority, which is crucial to ensuring the success of its future missions.

Summary  of  the  Most  Important  Results: 

1. Online kernel learning. Although a large number of studies are devoted to kernel learning, most of them suffer from high
computational cost, making them inefficient for handling a large number of training examples. We address this challenge by developing an
online learning theory for kernel learning. Compared to existing approaches, online kernel learning is computationally more
efficient because it needs to scan the entire set of training examples only once. Online kernel learning is generally more challenging than
typical online learning because it requires learning both the kernel classifiers and their combination weights simultaneously. We have
developed both deterministic and stochastic approaches for online kernel learning, and we have derived mistake bounds for
both types of algorithms. This work has been published in the proceedings of Algorithmic Learning Theory (ALT) 2010 [1].
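
The idea of learning the kernel classifiers and their combination weights simultaneously in a single pass can be sketched as follows. This is an illustrative sketch under stated assumptions and not necessarily the algorithm analyzed in [1]: it assumes one kernel perceptron per base kernel and a multiplicative (Hedge-style) discount of a kernel's weight whenever its classifier errs. The class name `OnlineKernelCombiner`, the discount factor `beta`, and the toy XOR-like stream are hypothetical.

```python
import math

def linear(x, z):
    return sum(a * b for a, b in zip(x, z))

def rbf(gamma):
    def k(x, z):
        return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))
    return k

class OnlineKernelCombiner:
    """One kernel perceptron per kernel plus multiplicative weight updates."""

    def __init__(self, kernels, beta=0.5):
        self.kernels = kernels
        self.beta = beta                       # assumed discount in (0, 1)
        self.weights = [1.0 / len(kernels)] * len(kernels)
        self.support = [[] for _ in kernels]   # per-kernel list of (label, point)

    def _scores(self, x):
        # Raw output of each kernel perceptron on x.
        return [sum(y_s * k(x_s, x) for y_s, x_s in sv)
                for k, sv in zip(self.kernels, self.support)]

    def predict(self, x):
        # Weighted vote over the signs of the per-kernel classifiers.
        vote = sum(w * (1.0 if s >= 0 else -1.0)
                   for w, s in zip(self.weights, self._scores(x)))
        return 1 if vote >= 0 else -1

    def update(self, x, y):
        # Each example is seen once: erring classifiers store it and lose weight.
        for i, s in enumerate(self._scores(x)):
            if y * s <= 0:
                self.support[i].append((y, x))
                self.weights[i] *= self.beta
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]

# Toy XOR-like stream: not linearly separable, so the linear kernel keeps erring.
stream = [((1.0, 1.0), 1), ((1.0, -1.0), -1), ((-1.0, -1.0), 1), ((-1.0, 1.0), -1),
          ((0.9, 1.1), 1), ((1.1, -0.9), -1), ((-0.9, -1.1), 1), ((-1.1, 0.9), -1)]

model = OnlineKernelCombiner([linear, rbf(1.0)])
for x, y in stream:
    model.update(x, y)   # single pass over the data

print("combination weights (linear, rbf):", model.weights)
```

On this stream the weight shifts toward the RBF kernel, since the linear perceptron cannot represent the XOR pattern and is discounted more often.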


2. Kernel learning from noisy side information. Most studies on kernel learning assume that the side information, such as pairwise
constraints, is perfect and free of errors. In this project, we examine the problem of kernel learning from noisy side information in the
form of pairwise constraints. We emphasize that this is an important problem because pairwise constraints are often extracted from
data sources such as paper citations, and are therefore usually noisy and inaccurate. To address this challenging problem, we
introduce the Generalized Maximum Entropy (GME) model and propose a framework for learning a combination of kernels from
noisy side information based on the GME model. The theoretical analysis shows that, under appropriate assumptions, the classification
model trained from the noisy side information can be very close to the one trained from the perfect side information. Extensive
empirical studies verify the effectiveness of the proposed framework. This work has been published in the International Conference on
Machine Learning (ICML) 2010 [2].
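
The setting of noisy pairwise constraints can be illustrated with a small example. The sketch below does not implement the GME model; as a simpler stand-in, it scores each candidate kernel by a kernel-target alignment restricted to the constrained pairs, where a must-link constraint is labeled +1 and a cannot-link constraint -1. The data, the two RBF bandwidths, and the single flipped constraint are hypothetical choices made for illustration.

```python
import math

def rbf(gamma):
    def k(x, z):
        return math.exp(-gamma * (x - z) ** 2)
    return k

def constraint_alignment(points, constraints, kernel):
    # Normalized agreement between kernel values and +1/-1 constraint labels.
    values = [kernel(points[i], points[j]) for i, j, _ in constraints]
    dot = sum(y * v for (_, _, y), v in zip(constraints, values))
    norm = math.sqrt(sum(v * v for v in values)) * math.sqrt(len(constraints))
    return dot / norm

# Two well-separated 1-D clusters: {0.0, 0.2} and {3.0, 3.2}.
points = [0.0, 0.2, 3.0, 3.2]

# (i, j, label): +1 = must-link (same class), -1 = cannot-link.
clean = [(0, 1, +1), (2, 3, +1), (0, 2, -1), (1, 3, -1), (0, 3, -1), (1, 2, -1)]

# Noisy side information: the last constraint is wrongly recorded as must-link.
noisy = clean[:-1] + [(1, 2, +1)]

narrow, wide = rbf(1.0), rbf(0.01)
for name, cons in [("clean", clean), ("noisy", noisy)]:
    a_narrow = constraint_alignment(points, cons, narrow)
    a_wide = constraint_alignment(points, cons, wide)
    print(name, "narrow:", round(a_narrow, 3), "wide:", round(a_wide, 3))
```

In this toy case the narrow kernel, which matches the cluster structure, scores higher than the wide kernel under both the clean and the noisy constraints. With a larger fraction of flipped constraints a simple score like this one can be misled, which is the situation a noise-aware model is meant to handle.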

3. Unsupervised kernel classification. We study the problem of building a kernel classifier for a target class in the absence of any
labeled training examples for that class. To address this difficult learning problem, we extend the idea of transfer learning by assuming
that the following side information is available: (i) a collection of labeled examples belonging to other classes in the problem domain,
called the auxiliary classes; and (ii) class information, including the prior of the target class and the correlation between the target class
and the auxiliary classes. Our goal is to construct a kernel classifier for the target class by leveraging the above data and information.
Our framework is based on the generalized maximum entropy model, which is effective in transferring the label information of the
auxiliary classes to the target class. A theoretical analysis shows that, under certain assumptions, the classification model obtained by
the proposed approach converges to the optimal model that would be learned from labeled examples of the target class. An empirical
study on text categorization over four different data sets verifies the effectiveness of the proposed approach. This work has been
published in the ACM Conference on Knowledge Discovery and Data Mining (KDD), 2010.